Towards Understanding the Generalization Bias of Two Layer Convolutional Linear Classifiers with Gradient Descent
Authors
Abstract
A major challenge in understanding the generalization of deep learning is to explain why (stochastic) gradient descent can exploit the network architecture to find solutions with good generalization performance when using high-capacity models. We find simple but realistic examples showing that this phenomenon exists even when learning linear classifiers — between two linear networks with the same capacity, the one with a convolutional layer can generalize better than the other when the data distribution has some underlying spatial structure. We argue that this difference results from a combination of the convolutional architecture, the data distribution, and gradient descent, all of which must be included in a meaningful analysis. We provide a general analysis of generalization performance as a function of data distribution and convolutional filter size, with gradient descent as the optimization algorithm, and then interpret the results using concrete examples. Experimental results show that our analysis can explain the behavior of the examples we introduce.
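As a rough illustration of the claim (not the paper's actual experimental setup), the following numpy sketch trains two two-layer linear classifiers with gradient descent on a synthetic task with spatial structure: a fixed local pattern planted at a random position, whose sign determines the label. Both models compute a purely linear function of the input, but one parameterizes the first layer as a shared convolutional filter. The data distribution, model sizes, and hyperparameters here are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 32, 4                  # input length, filter size
n_train, n_test = 40, 1000

def make_data(n):
    # Hypothetical spatially structured task: a fixed local pattern is
    # planted at a random position, and its sign determines the label.
    pattern = np.array([1.0, -1.0, 1.0, -1.0])
    X = 0.5 * rng.standard_normal((n, d))
    y = rng.choice([-1.0, 1.0], size=n)
    for i in range(n):
        pos = rng.integers(0, d - k + 1)
        X[i, pos:pos + k] += y[i] * pattern
    return X, y

def patches(X):
    # All length-k sliding windows (stride 1): shape (n, d-k+1, k).
    return np.stack([X[:, j:j + k] for j in range(d - k + 1)], axis=1)

def loss_grad(margin, y):
    # d/df of the logistic loss log(1 + exp(-y f)).
    return -y / (1.0 + np.exp(margin))

def train_conv(X, y, steps=3000, lr=0.2):
    # Two-layer linear net with a shared convolutional filter w:
    # f(x) = v . (patches(x) @ w). Linear in x, but factored.
    P = patches(X)
    w = 0.1 * rng.standard_normal(k)
    v = 0.1 * rng.standard_normal(d - k + 1)
    for _ in range(steps):
        h = P @ w
        g = loss_grad(y * (h @ v), y)
        w -= lr * np.einsum('n,npk,p->k', g, P, v) / len(y)
        v -= lr * (g[:, None] * h).mean(axis=0)
    return lambda X: patches(X) @ w @ v

def train_fc(X, y, steps=3000, lr=0.2):
    # Same end-to-end capacity (v^T W is a single linear classifier),
    # but the first layer is fully connected.
    W = 0.1 * rng.standard_normal((d, d))
    v = 0.1 * rng.standard_normal(d)
    for _ in range(steps):
        h = X @ W.T
        g = loss_grad(y * (h @ v), y)
        W -= lr * np.einsum('n,p,nk->pk', g, v, X) / len(y)
        v -= lr * (g[:, None] * h).mean(axis=0)
    return lambda X: X @ W.T @ v

Xtr, ytr = make_data(n_train)
Xte, yte = make_data(n_test)
for name, train in [("conv", train_conv), ("fc", train_fc)]:
    f = train(Xtr, ytr)
    print(name, "test accuracy:", np.mean(np.sign(f(Xte)) == yte))
```

With enough training samples both models fit the task; the interesting regime is small n_train, where the convolutional parameterization tends to reach higher test accuracy, consistent with the abstract's claim. Exact numbers will vary with the seed and hyperparameters.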
Similar References
Learning Document Image Features With SqueezeNet Convolutional Neural Network
The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are the current state-of-the-art models for this task. However, these classifiers have two major drawbacks: the huge computational power demand for...
Traffic Sign Recognition Using Extreme Learning Classifier with Deep Convolutional Features
Traffic sign recognition is an important but challenging task, especially for automated driving and driver assistance. Its accuracy depends on two aspects: the feature extractor and the classifier. Current popular algorithms mainly use convolutional neural networks (CNNs) to execute both feature extraction and classification. Such methods can achieve impressive results but usually on the basis of an e...
Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning
Effective convolutional neural networks are trained on large sets of labeled data. However, creating large labeled datasets is a very costly and time-consuming task. Semi-supervised learning uses unlabeled data to train a model with higher accuracy when there is a limited set of labeled data available. In this paper, we consider the problem of semi-supervised learning with convolutional neural ...
Online learning from finite training sets and robustness to input
We analyse online gradient descent learning from finite training sets at non-infinitesimal learning rates. Exact results are obtained for the time-dependent generalization error of a simple model system: a linear network with a large number of weights N, trained on p = αN examples. This allows us to study in detail the effects of finite training set size on, for example, the optimal choice of learning...
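A minimal sketch of the kind of setup this snippet describes: a linear "student" with N weights trained by online gradient descent on a fixed finite set of p = αN examples generated by a "teacher". Everything here (the teacher/student construction and all parameter values) is an illustrative assumption, not the referenced paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100                      # number of weights
alpha = 0.5                  # training set size ratio: p = alpha * N
p = int(alpha * N)
eta, steps = 0.5, 5000       # non-infinitesimal learning rate, SGD steps

teacher = rng.standard_normal(N) / np.sqrt(N)   # target linear rule
X = rng.standard_normal((p, N))                  # fixed finite training set
y = X @ teacher                                  # noise-free targets

w = np.zeros(N)
for _ in range(steps):
    i = rng.integers(p)                  # online: one random training example
    err = X[i] @ w - y[i]
    w -= eta * err * X[i] / N            # squared-loss gradient step (1/N scaling)

# For Gaussian inputs, the generalization error E[(x . (w - teacher))^2]
# equals the squared parameter distance; it plateaus above zero because
# the finite training set only constrains w inside the data's span.
print("generalization error:", np.sum((w - teacher) ** 2))
```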
Learning Compact Neural Networks with Regularization
We study the impact of regularization for learning neural networks. Our goal is to speed up training, improve generalization performance, and train compact models that are cost-efficient. Our results apply to weight-sharing (e.g. convolutional), sparsity (i.e. pruning), and low-rank constraints among others. We first introduce the covering dimension of the constraint set and provide a Rademach...
Journal: CoRR
Volume: abs/1802.04420
Pages: -
Publication date: 2018